-
Large language models (LLMs) are increasingly deployed as part of pipelines that repeatedly process or generate data. However, a common barrier to deployment is the frequent and often unpredictable errors that plague LLMs. Acknowledging the inevitability of these errors, we propose data quality assertions to identify when LLMs may be making mistakes. We present spade, a method for automatically synthesizing data quality assertions that identify bad LLM outputs. We observe that developers often identify data quality issues during prototyping, prior to deployment, and attempt to address them by adding instructions to the LLM prompt over time. spade therefore analyzes histories of prompt versions to create candidate assertion functions and then selects a minimal set that fulfills both coverage and accuracy requirements. In testing across nine different real-world LLM pipelines, spade reduces the number of assertions by 14% and decreases false failures by 21% when compared to simpler baselines. spade has been deployed as an offering within LangSmith, LangChain's LLM pipeline hub, and has been used to generate data quality assertions for over 2000 pipelines across a spectrum of industries.
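The selection step described above can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not say how the minimal set is chosen, so a greedy set-cover heuristic is substituted here, and every name in it (the candidate assertions, `select_assertions`, `max_false_failure_rate`, the labeled examples) is a hypothetical stand-in. Each candidate assertion returns True when an LLM output passes. Candidates that fail too many known-good outputs are dropped (the accuracy requirement), and the remainder are picked greedily until every detectable bad output is flagged by at least one chosen assertion (the coverage requirement).

```python
from typing import Callable, Dict, List, Set, Tuple

Assertion = Callable[[str], bool]  # returns True when the output passes


# Hypothetical candidate assertions, invented for this sketch.
def is_nonempty(output: str) -> bool:
    return bool(output.strip())


def under_200_chars(output: str) -> bool:
    return len(output) <= 200


def nonempty_and_short(output: str) -> bool:
    return bool(output.strip()) and len(output) <= 200


def no_apology(output: str) -> bool:
    return "sorry" not in output.lower()


def ends_with_period(output: str) -> bool:
    return output.strip().endswith(".")


def select_assertions(
    candidates: Dict[str, Assertion],
    examples: List[Tuple[str, bool]],
    max_false_failure_rate: float,
) -> List[str]:
    """Pick a small set of assertions via greedy set cover (an assumed heuristic).

    `examples` holds (output, is_good) pairs labeled by a developer.
    """
    good = [out for out, ok in examples if ok]
    bad_idx = [i for i, (_, ok) in enumerate(examples) if not ok]

    # Accuracy filter: drop candidates that fail too many good outputs.
    eligible: Dict[str, Set[int]] = {}
    for name, fn in candidates.items():
        false_failures = sum(1 for out in good if not fn(out))
        rate = false_failures / len(good) if good else 0.0
        if rate <= max_false_failure_rate:
            # Record which bad outputs this candidate flags.
            eligible[name] = {i for i in bad_idx if not fn(examples[i][0])}

    # Coverage: greedily take the candidate flagging the most uncovered bad outputs.
    uncovered: Set[int] = set().union(*eligible.values()) if eligible else set()
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(eligible), key=lambda n: len(eligible[n] & uncovered))
        chosen.append(best)
        uncovered -= eligible[best]
    return chosen


CANDIDATES: Dict[str, Assertion] = {
    "is_nonempty": is_nonempty,
    "under_200_chars": under_200_chars,
    "nonempty_and_short": nonempty_and_short,
    "no_apology": no_apology,
    "ends_with_period": ends_with_period,
}

# Made-up labeled outputs; the second good output lacks a final period on purpose.
EXAMPLES: List[Tuple[str, bool]] = [
    ("The invoice total is 42 USD.", True),
    ("Summary: revenue grew in Q2", True),
    ("", False),
    ("Sorry, I cannot help with that.", False),
    ("x" * 300, False),
]

if __name__ == "__main__":
    print(select_assertions(CANDIDATES, EXAMPLES, max_false_failure_rate=0.0))
```

In this toy data, `ends_with_period` is rejected because it fails a good output, and the two narrower checks become redundant once the combined `nonempty_and_short` check is selected. A greedy pass gives no guarantee of a truly minimal set; an exact formulation would need an optimization solver.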
-
Understanding the scope, prevalence, and impact of the COVID-19 pandemic response will be a rich ground for research for many years. Key to the response to COVID-19 were non-pharmaceutical intervention (NPI) measures, such as mask mandates or stay-in-place orders. For future pandemic preparedness, it is critical to understand the impact and scope of these interventions. Given the ongoing nature of the pandemic, existing NPI studies that cover only its initial portion provide a narrow view of the impact of NPI measures. This paper describes a dataset of NPI measures taken by counties in the U.S. state of Virginia over the first two years of the pandemic, beginning in March 2020. This data enables analyses of NPI measures over a long time period, covering both the effectiveness of individual NPIs in slowing the spread of the pandemic and the impact of various NPI measures on the behavior and conditions of the different counties and the state.
-
The effectiveness of social distancing as a disease-slowing measure depends on the degree to which individuals comply with such orders. In this ongoing research, we study outdoor pedestrian activity in New York City, specifically using (a) video streams gathered from public traffic cameras, (b) dashcam footage from vehicles driving through the city, and (c) mobile phone geo-location data volunteered by local citizens. This project seeks to form a multi-scale map of urban mobility and space occupancy under social distancing policy. The data collected will enable researchers to infer the activities, contexts, origins, and destinations of the people in public spaces. This information can reveal where and, in turn, why stay-at-home orders are and are not being followed. As this is a work in progress, it is still too early for detailed findings. However, we report here on several unanticipated factors that have already influenced the course of the project, among them: the death of George Floyd and subsequent protests, data collection challenges, changes in the weather, and the unexpected nature of the progression of COVID-19.
